Self-Contained Statistical Analysis of Gene Sets
Abstract
Microarrays are a powerful tool for studying differential gene expression. However, they often generate long lists of differentially expressed genes, and extracting meaningful biological processes from these lists can be challenging. For this reason, investigators have sought to quantify the statistical significance of compiled gene sets rather than of individual genes. The gene sets are typically organized around a biological theme or pathway. We compute correlations between different gene set tests and elect to use Fisher's self-contained method for gene set analysis. We improve Fisher's differential expression analysis of a gene set by limiting the p-value of an individual gene within the set, which prevents a small percentage of genes from determining the statistical significance of the entire set. We also compute dependencies among genes within the set to determine which genes are statistically linked. The method is applied to T-ALL (T-lineage Acute Lymphoblastic Leukemia) to identify differentially expressed gene sets between T-ALL and normal patients, and between T-ALL and AML (Acute Myeloid Leukemia) patients.
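As a rough illustration of the idea in the abstract, the sketch below combines per-gene p-values with Fisher's method, using the statistic X = -2 Σ ln p_i, and optionally applies a lower bound to each p-value before combining. The function name `fisher_combined`, the parameter `p_floor`, and the reading of "limiting" as a lower bound are assumptions made for illustration; the abstract does not give the authors' exact procedure. The chi-square reference distribution with 2k degrees of freedom assumes independent genes, which often fails for co-expressed genes, so a permutation-based null may be more appropriate in practice.

```python
import math


def fisher_combined(p_values, p_floor=None):
    """Combine per-gene p-values with Fisher's method.

    p_floor is a hypothetical parameter (not from the paper): when given,
    every p-value below it is raised to p_floor, so that one extreme gene
    cannot dominate the gene-set statistic.

    Returns (statistic, combined_p), where combined_p is the upper tail of
    a chi-square distribution with 2k degrees of freedom. This assumes the
    genes are independent.
    """
    p_values = list(p_values)
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    if p_floor is not None:
        p_values = [max(p, p_floor) for p in p_values]

    k = len(p_values)
    statistic = -2.0 * sum(math.log(p) for p in p_values)

    # For even degrees of freedom 2k the chi-square survival function has
    # the closed form exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!.
    half = statistic / 2.0
    term = 1.0
    total = 1.0
    for j in range(1, k):
        term *= half / j
        total += term
    return statistic, math.exp(-half) * total


# Hypothetical gene set: one extreme gene and three unremarkable ones.
gene_p = [1e-12, 0.5, 0.6, 0.7]
_, p_unbounded = fisher_combined(gene_p)
_, p_bounded = fisher_combined(gene_p, p_floor=0.01)
print(p_unbounded, p_bounded)
```

Under independence, raising small p-values can only shrink the statistic, so comparing the bounded statistic with the same chi-square distribution gives a conservative combined p-value rather than an exact one.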
Similar resources
Gene-set analysis and reduction
Gene-set analysis aims to identify differentially expressed gene sets (pathways) by a phenotype in DNA microarray studies. We review here important methodological aspects of gene-set analysis and illustrate them with varying performance of several methods proposed in the literature. We emphasize the importance of distinguishing between 'self-contained' versus 'competitive' methods, following Go...
An Application of Gene Set Analysis for a Comparison of Two Groups
Background: Microarrays are biotechnological advancements measuring expressions of thousands of genes in a single assay. A two-group microarray study yields gene expression measurements for patients with a disease of interest and for healthy controls. Successful identification of genes differentiating between the two groups leads to new and improved treatments. While microarrays represent an ex...
Statistical power of gene-set enrichment analysis is a function of gene set correlation structure
We develop an analytic statistical framework for examining a variety of gene-set enrichment analysis tests. Within this framework, we describe why statistical power for both self-contained and competitive gene set tests is a function of the correlation structure of co-expressed genes, and why this characteristic is undesirable for gene-set analyses. We additionally describe why past gene-set t...
Estimation of Gene Induction Enables a Relevance-Based Ranking of Gene Sets
In order to handle and interpret the vast amounts of data produced by microarray experiments, the analysis of sets of genes with a common biological functionality has been shown to be advantageous compared to single gene analyses. Some statistical methods have been proposed to analyse the differential gene expression of gene sets in microarray experiments. However, most of these methods either ...
A decision-theory approach to interpretable set analysis for high-dimensional data.
A key problem in high-dimensional significance analysis is to find pre-defined sets that show enrichment for a statistical signal of interest; the classic example is the enrichment of gene sets for differentially expressed genes. Here, we propose a new decision-theory approach to the analysis of gene sets which focuses on estimating the fraction of non-null variables in a set. We introduce the ...
Journal title:
Volume 11, issue
Pages: -
Publication date: 2016